Incremental OPTICS: Efficient Computation of Updates in a Hierarchical Cluster Ordering

Authors

  • Hans-Peter Kriegel
  • Peer Kröger
  • Irina Gotlibovich
Abstract

Data warehouses are a challenging field of application for data mining tasks such as clustering. Usually, updates are collected and applied to the data warehouse periodically in a batch mode. As a consequence, all mined patterns discovered in the data warehouse (e.g. clustering structures) have to be updated as well. In this paper, we present a method for incrementally updating the clustering structure computed by the hierarchical clustering algorithm OPTICS. We determine the parts of the cluster ordering that are affected by update operations and develop efficient algorithms that incrementally update an existing cluster ordering. A performance evaluation of incremental OPTICS based on synthetic datasets as well as on a real-world dataset demonstrates that incremental OPTICS gains significant speed-up factors over OPTICS for update operations.
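The idea of determining which objects are affected by an update can be illustrated with a small brute-force sketch. This is not the paper's algorithm: it simply recomputes core distances before and after an insertion to show that only objects within distance `eps` of the new point can change. The function names, and the convention that a point counts as its own neighbor, are assumptions made for illustration.

```python
import math

def core_distance(points, i, eps, min_pts):
    """Core distance of points[i]: distance to its min_pts-th nearest
    neighbor (counting the point itself), or None if that exceeds eps."""
    dists = sorted(math.dist(points[i], q) for q in points)
    if len(dists) < min_pts or dists[min_pts - 1] > eps:
        return None
    return dists[min_pts - 1]

def affected_by_insert(points, new_point, eps, min_pts):
    """Indices of existing points whose core distance changes
    when new_point is inserted (brute force, for illustration only)."""
    n = len(points)
    before = [core_distance(points, i, eps, min_pts) for i in range(n)]
    updated = points + [new_point]
    after = [core_distance(updated, i, eps, min_pts) for i in range(n)]
    return [i for i in range(n) if before[i] != after[i]]

# Hypothetical toy data: four points on a line, one far away.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
new_point = (0.5, 0.0)
affected = affected_by_insert(points, new_point, eps=1.5, min_pts=3)
print(affected)  # [0, 2]
```

Every affected point lies within `eps` of the inserted point, which is why an incremental method can restrict its work to a local region instead of recomputing the whole cluster ordering.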

Similar resources

Dynamic Local Feature Selection in Incremental Clustering

In this paper we describe a preliminary study into the use of feature selection in incremental hierarchical clustering. Our aim is to add this capability to the clustering system while still maintaining the incremental nature of the learning process. This constraint led us to consider a dynamic feature selection mechanism which is performed in parallel to the clustering process. In addition, feature ...


An Incremental Approach to Building a Cluster Hierarchy

In this paper we present a novel Incremental Hierarchical Clustering (IHC) algorithm. Our approach aims to construct a hierarchy that satisfies the homogeneity and the monotonicity properties. Working in a bottom-up fashion, a new instance is placed in the hierarchy and a sequence of hierarchy restructuring processes is performed only in regions that have been affected by the presence of the new ...


Incremental Shared Nearest Neighbor Density-Based Clustering Algorithms for Dynamic Datasets

Dynamic datasets undergo frequent changes where a small number of data points are added and deleted. Such dynamic datasets are frequently encountered in many real-world applications such as search engines and recommender systems. Incremental data mining algorithms process these updates to datasets efficiently to avoid redundant computation. Shared nearest neighbor density based clustering (SNN-DB...


Batch Incremental Shared Nearest Neighbor Density Based Clustering Algorithm for Dynamic Datasets

Incremental data mining algorithms process frequent updates to dynamic datasets efficiently by avoiding redundant computation. The existing incremental extension to the shared nearest neighbor density-based clustering (SNND) algorithm cannot handle deletions from the dataset and handles insertions only one point at a time. We present an incremental algorithm to overcome both these bottlenecks by efficiently ...


Incremental parallel and distributed systems

Incremental computation strives for efficient successive runs of applications by reexecuting only those parts of the computation that are affected by a given input change instead of recomputing everything from scratch. To realize the benefits of incremental computation, researchers and practitioners are developing new systems where the application programmer can provide an efficient update mech...

Publication date: 2003